Embedding auxiliary data in a signal.



FIELD OF THE INVENTION 

The invention relates to a method and arrangement for embedding auxiliary data 
in an information signal, for example, a video signal, an audio signal, or, more generally, 
multimedia content. The invention also relates to a method and arrangement for detecting said 
auxiliary data. 



BACKGROUND OF THE INVENTION 

A known method of embedding auxiliary data is disclosed in US Patent
5,748,783. In this prior art method, an N-bit code is embedded through the addition of a low
amplitude watermark which has the look of pure noise. Each bit of the code is associated with
an individual watermark which has a dimension and extent equal to the original signal (e.g.
both are a 512x512 digital image). A code bit "1" is represented by adding the respective
watermark to the signal. A code bit "0" is represented by refraining from adding the respective
watermark to the signal or, alternatively, by subtracting it from the signal. The N-bit code is
thus represented by the sum of up to N different watermark (noise) patterns.

When an image (or part of an image) in, say, an issue of a magazine, is
suspected of being an illegal copy of an original image, the original image is subtracted from
the suspect image and the N individual watermark patterns are cross-correlated with the
difference image. Depending on the amount of correlation between the difference image and
each individual watermark pattern, the respective bit is assigned either a "0" or a "1" and the
N-bit code is retrieved.



A drawback of the prior art method is that N different watermark patterns are to be
added at the encoding end, and N watermark patterns are to be individually detected at the
decoding end.

OBJECT AND SUMMARY OF THE INVENTION

It is an object of the invention to provide a method and arrangement for
embedding and detecting a watermark which overcomes the drawbacks of the prior art.

To this end, the invention provides a method of embedding auxiliary data in an
information signal, comprising the steps of: shifting one or more predetermined watermark
patterns one or more times over a vector, the respective vector(s) being indicative of said
auxiliary data; and embedding said shifted watermark(s) in said information signal. The
corresponding method of detecting auxiliary data in an information signal comprises the steps
of: detecting one or more embedded watermarks; determining a vector by which each detected
watermark is shifted with respect to a predetermined watermark; and retrieving said auxiliary
data from said vector(s). Preferred embodiments of the invention are defined in the subclaims.

The invention allows multi-bit codes to be accommodated in a single watermark
pattern or only a few different watermark patterns. This is important for watermark detection
in home equipment such as video and audio players and recorders because the watermark
patterns to be detected must be stored in said equipment. The invention exploits the insight
that detection methods are available which not only detect whether or not a given watermark is
embedded in a signal but also provide, without additional computational effort, the relative
positions of pluralities of said watermark. This is a significant advantage because the number
of bits that can be embedded in information content is always a trade-off between robustness,
visibility and detection speed in practice. The invention thus allows real-time detection with
moderate hardware requirements.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows schematically an arrangement for embedding a watermark in a 
signal in accordance with the invention. 

Figs. 2 and 3 show diagrams to illustrate the operation of the embedder which is 
shown in Fig. 1. 

Fig. 4 shows schematically an arrangement for detecting the embedded 
watermark in accordance with the invention. 

Figs. 5, 6A and 6B show diagrams to illustrate the operation of the detector
which is shown in Fig. 4.

Fig. 7 shows a device for playing back a video bit stream with an embedded
watermark.

Figs. 8 and 9 show further diagrams to illustrate the operation of embedding and
detecting multi-bit information in a watermark in accordance with the invention.

DESCRIPTION OF PREFERRED EMBODIMENTS 

For the sake of convenience, the watermarking scheme in accordance with the
invention will be described as a system for attaching invisible labels to video contents, but the
teachings can obviously be applied to any other contents, including audio and multimedia. We
will hereinafter often refer to this method as JAWS (Just Another Watermarking System).

Fig. 1 shows a practical embodiment of the watermark embedder in accordance
with the invention. The embedder comprises an image source 11 which generates an image P,
and an adder 12 which adds a watermark W to the image P. The watermark W is a noise
pattern having the same size as the image, e.g. N1 pixels horizontally and N2 pixels vertically.
The watermark W represents a key K, i.e. a multi-bit code which is to be retrieved at the
receiving end.

To spare the watermark detection process a search for the watermark W
over the large N1xN2 space, the watermark is generated by repeating, and if necessary
truncating, smaller units called "tiles" W(K) over the extent of the image. This "tiling"
operation (15) is illustrated in Fig. 2. The tiles W(K) have a fixed size MxM. The tile size M
should not be too small: smaller M implies more symmetry in W(K) and therefore a larger
security risk. On the other hand, M should not be too large: a large value of M implies a large
search space for the detector and therefore a large complexity. In JAWS we have chosen
M=128 as a reasonable compromise.
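
The tiling operation can be illustrated with a minimal Python sketch. The function name tile_watermark and the toy 2x2 tile are hypothetical and chosen only for illustration; the text itself uses M=128 and does not prescribe an implementation.

```python
def tile_watermark(tile, n1, n2):
    """Repeat an MxM tile over an image n1 pixels wide and n2 pixels high.

    Tiles that overrun the right or bottom edge are truncated.
    """
    m = len(tile)
    return [[tile[y % m][x % m] for x in range(n1)] for y in range(n2)]


# Toy example (assumed values): a 2x2 tile spread over a 5-wide, 3-high image.
tile = [[1, -1],
        [-1, 1]]
w = tile_watermark(tile, n1=5, n2=3)
for row in w:
    print(row)
```

Indexing the tile modulo M performs the repetition and the truncation in one step, so image sizes that are not multiples of M need no special handling.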

Then, a local depth map or visibility mask λ(P) is computed (16). At each pixel
position, λ(P) provides a measure for the visibility of additive noise. The map λ(P) is
constructed to have an average value equal to 1. The extended sequence W(K) is subsequently
modulated (17) with λ(P), i.e. the value of the tiled watermark W(K) at each position is
multiplied by the visibility value of λ(P) at that position. The resulting noise sequence W(K,P)
is therefore dependent on both the key K and the image content of P. We refer to W(K,P) as an
adaptive watermark as it adapts to the image P.

Finally, the strength of the final watermark is determined by a global depth
parameter d which provides a global scaling (18) of W(K,P). A large value of d corresponds to
a robust but possibly visible watermark. A small value corresponds to an almost imperceptible
but weak watermark. The actual choice of d will be a compromise between the robustness and
perceptibility requirements. The watermarked image Q is obtained by adding (12)
W = d x W(K,P) to P, rounding to integer pixel values and clipping to the allowed pixel value
range.
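
The modulation, scaling, rounding and clipping steps can be sketched as follows. This is an illustration under assumptions, not the embedder of Fig. 1: the function name embed, the 8-bit pixel range 0..255 and the sample visibility values are all hypothetical, and the text does not say how the visibility mask itself is computed.

```python
def embed(image, tiled_wm, visibility, d, lo=0, hi=255):
    """Return Q = clip(round(P + d * lambda(P) * W(K))), pixel by pixel.

    lo and hi are the allowed pixel range (8-bit range assumed here).
    """
    out = []
    for p_row, w_row, v_row in zip(image, tiled_wm, visibility):
        row = []
        for p, w, v in zip(p_row, w_row, v_row):
            q = int(round(p + d * v * w))     # modulate, scale, add, round
            row.append(max(lo, min(hi, q)))   # clip to the allowed range
        out.append(row)
    return out


# Assumed toy data: 2x2 image, tiled watermark and visibility mask (average 1).
P = [[100, 254], [1, 128]]
W = [[1, 1], [-1, -1]]
LAM = [[1.5, 1.0], [0.5, 1.0]]
Q = embed(P, W, LAM, d=2)
print(Q)
```

In this toy case the pixel at 254 would reach 256 and is clipped to 255, which shows why clipping slightly weakens the watermark in saturated image areas.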

In order to embed the multi-bit code K in the watermark W, every tile W(K) is
built up from a limited set of uncorrelated basic or primitive tiles {W1..Wn} and shifted
versions thereof, in accordance with

\[ W(K) = \sum_{i,j} s_{i,j} \, \mathrm{shift}(W_i, k_{i,j}) \]

where "shift(W_i, k_{i,j})" represents a spatial shift of a basic MxM tile W_i over a vector k_{i,j} with
cyclic wrap-around. The signs s ∈ {-1,+1} and the shifts k depend on the key K via an encoding
function E (13). It is the task of the detector to reconstruct K after retrieving the signs s_{i,j} and
the shifts k_{i,j}. Note that each basic tile W_i may occur several times. In Fig. 1, the encoder 13
generates W(K)=W1+W2-W2' where W2' is a shifted version of W2. Fig. 3 illustrates this
operation.
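
The composition of a tile from signed, cyclically shifted basic tiles can be sketched in Python as below. The helper names, the toy tile size M=8, the random ±1 tiles and the shift (3, 5) are assumptions made for illustration. The direction convention (content moves kx pixels to the right and ky pixels down) is also an assumption, since the text does not fix one.

```python
import random


def cyclic_shift(tile, k):
    """Shift an MxM tile over k = (kx, ky) with cyclic wrap-around."""
    m = len(tile)
    kx, ky = k
    return [[tile[(y - ky) % m][(x - kx) % m] for x in range(m)] for y in range(m)]


def compose_tile(terms):
    """Sum sign * shift(basic_tile, k) over a list of (sign, basic_tile, k) terms."""
    m = len(terms[0][1])
    out = [[0] * m for _ in range(m)]
    for sign, basic, k in terms:
        shifted = cyclic_shift(basic, k)
        for y in range(m):
            for x in range(m):
                out[y][x] += sign * shifted[y][x]
    return out


rng = random.Random(1)
M = 8  # toy size; the text uses M = 128
W1 = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(M)]
W2 = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(M)]
k = (3, 5)  # hypothetical shift vector
# W(K) = W1 + W2 - W2', with W2' the shifted version of W2.
WK = compose_tile([(+1, W1, (0, 0)), (+1, W2, (0, 0)), (-1, W2, k)])
```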

Fig. 4 shows a schematic diagram of a watermark detector. The watermark 
detector receives possibly watermarked images Q. Watermark detection in JAWS is not done 
for every single frame, but for groups of frames. By accumulating (21) a number of frames, the
statistics of detection are improved, and therefore also the reliability of detection. The
accumulated frames are subsequently partitioned (22) into blocks of size MxM (M=128) and 
all the blocks are stacked (23) in a buffer q of size MxM. This operation is known as folding. 
Fig. 5 illustrates this operation of folding. 
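
Accumulation and folding can be sketched as follows. The function names and the toy frame sizes are hypothetical, and the sketch assumes that "stacking" means summing the blocks position by position into the MxM buffer.

```python
def accumulate(frames):
    """Sum a group of equally sized frames pixel by pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) for x in range(width)] for y in range(height)]


def fold(frame, m):
    """Add all MxM blocks of a frame into a single MxM buffer q."""
    q = [[0] * m for _ in range(m)]
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            q[y % m][x % m] += value
    return q


# Toy example (assumed sizes): three frames, 6 pixels wide and 4 high, folded with M = 2.
frames = [[[1] * 6 for _ in range(4)] for _ in range(3)]
q = fold(accumulate(frames), m=2)
print(q)
```

Because the embedded watermark repeats with period M while the image content does not, folding reinforces the watermark relative to the image.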

The next step in the detection process is to establish whether a particular noise
pattern is present in buffer q. To detect whether or not the buffer q includes a particular watermark
pattern W, the buffer contents and said watermark pattern are subjected to correlation.
Computing the correlation of a suspect information signal q with a watermark pattern w
comprises computing the inner product d=<q,w> of the information signal values and the
corresponding values of the watermark pattern. For a one-dimensional information signal
q={q_n} and watermark pattern w={w_n}, this can be written in mathematical notation as:

\[ d = \frac{1}{N} \sum_{n=1}^{N} q_n w_n \]

For the two-dimensional MxM image q={q_ij} and watermark pattern W={w_ij}, the inner
product is:

\[ d = \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} q_{ij} w_{ij} \]

In principle, the vector k by which a tile W_i has been shifted can be found by
successively applying W_i with different vectors k to the detector, and determining for which k
the correlation is maximal. However, this brute force searching algorithm is time consuming.
Moreover, the image Q may have undergone various forms of processing (such as translation
or cropping) prior to the watermark detection, so that the detector does not know the spatial
location of the basic watermark pattern W_i with respect to the image Q.

Instead of brute force searching, JAWS exploits the structure of the patterns
W(K). The buffer q is examined for the presence of these primitive patterns, their signs and
shifts. The correlation d_k of an image q and a primitive pattern w being shifted by a vector k
(k_x pixels horizontally and k_y pixels vertically) is:

\[ d_k = \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} q_{ij} \, w_{i-k_x,\, j-k_y} \]

where the indices of w are taken modulo M (cyclic wrap-around).

The correlation values for all possible shift vectors k of a basic pattern W_i are
simultaneously computed using the Fast Fourier Transform. As shown in Fig. 4, both the
contents of buffer q and the basic watermark pattern W_i are subjected to a Fast Fourier
Transform (FFT) in transform circuits 24 and 25, respectively. These operations yield:

\[ \hat{q} = \mathrm{FFT}(q) \quad \text{and} \quad \hat{w} = \mathrm{FFT}(w), \]

where q̂ and ŵ are sets of complex numbers.

Computing the correlation is similar to computing the convolution of q and the
conjugate of W_i. In the transform domain, this corresponds to:

\[ \hat{d} = \hat{q} \otimes \mathrm{conj}(\hat{w}) \]

where the symbol ⊗ denotes pointwise multiplication and conj() denotes inverting the sign of
the imaginary part of the argument. In Fig. 4, the conjugation of ŵ is carried out by a
conjugation circuit 26, and the pointwise multiplication is carried out by a multiplier 27. The
set of correlation values d={d_k} is now obtained by inverse Fourier transforming the result of
said multiplication:

\[ d = \mathrm{IFFT}(\hat{d}) \]

which is carried out in Fig. 4 by an inverse FFT circuit 28. The watermark pattern W_i is
detected to be present if a correlation value d_k is larger than a given threshold.
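
The FFT-based computation of all correlation values at once can be sketched in plain Python as below, with a brute-force version for comparison. This is an illustration under assumptions: the function names, the toy size M=8 and the shift (3, 5) are hypothetical, the radix-2 FFT is a textbook routine rather than the circuitry of Fig. 4, and the sums are left unnormalised because a constant factor does not move the peaks.

```python
import cmath
import random


def fft(a, invert=False):
    """Recursive radix-2 FFT of a list whose length is a power of two (unnormalised)."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2], invert), fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for i in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * i / n) * odd[i]
        out[i], out[i + n // 2] = even[i] + t, even[i] - t
    return out


def fft2(a, invert=False):
    """2-D FFT of an MxM array: rows first, then columns; the inverse divides by M*M."""
    m = len(a)
    rows = [fft(r, invert) for r in a]
    cols = [fft(list(c), invert) for c in zip(*rows)]
    scale = m * m if invert else 1
    return [[cols[x][y] / scale for x in range(m)] for y in range(m)]


def correlate_all_shifts(q, w):
    """d[ky][kx] = sum over (y, x) of q[y][x] * w[y-ky][x-kx], indices modulo M."""
    qh, wh = fft2(q), fft2(w)
    dh = [[a * b.conjugate() for a, b in zip(qr, wr)] for qr, wr in zip(qh, wh)]
    return [[v.real for v in row] for row in fft2(dh, invert=True)]


def correlate_brute(q, w):
    """Same result as correlate_all_shifts, computed directly from the definition."""
    m = len(q)
    return [[sum(q[y][x] * w[(y - ky) % m][(x - kx) % m]
                 for y in range(m) for x in range(m))
             for kx in range(m)] for ky in range(m)]


rng = random.Random(0)
M = 8  # toy size; the text uses M = 128
w = [[rng.choice((-1.0, 1.0)) for _ in range(M)] for _ in range(M)]
kx, ky = 3, 5  # hypothetical shift
q = [[w[(y - ky) % M][(x - kx) % M] for x in range(M)] for y in range(M)]
d = correlate_all_shifts(q, w)
peak = max((d[y][x], (x, y)) for y in range(M) for x in range(M))[1]
print(peak)
```

The brute-force version evaluates M*M inner products of M*M terms each, whereas the transform route needs three 2-D transforms and one pointwise product.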

Fig. 6A shows a graph of correlation values d_k if the presence of watermark
pattern W1 (see Figs. 1 and 3) in image Q is being checked. The peak 61 indicates that W1 is
indeed found. The position (0,0) of this peak indicates that the pattern W1 applied to the
detector happens to have the same spatial position with respect to the image Q as the pattern
W1 applied to the embedder. Fig. 6B shows the graph of correlation values if watermark
pattern W2 is applied to the detector. Two peaks are now found. The positive peak 62 at (0,0)
denotes the presence of watermark W2; the negative peak 63 at (48,80) denotes the presence of
watermark -W2'. The relative position of the latter peak 63 with respect to peak 62 (or, what is
similar, peak 61) reveals the relative position (in pixels) of W2' with respect to W2, i.e. the
shift vector k. The embedded data K is derived from the vectors thus found.
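
Reading the shift vector from the two peaks can be sketched as follows. The peak heights, the threshold and all function names are assumptions made for illustration. The sketch builds a synthetic correlation array in the spirit of Fig. 6B and shows that the relative position stays (48,80) when both peaks move together.

```python
def find_peaks(d, threshold):
    """Return [(sign, (kx, ky)), ...] for every |d[ky][kx]| above the threshold."""
    peaks = []
    for ky, row in enumerate(d):
        for kx, v in enumerate(row):
            if abs(v) > threshold:
                peaks.append((1 if v > 0 else -1, (kx, ky)))
    return peaks


def relative_shift(reference, other, m):
    """Position of `other` relative to `reference`, modulo the tile size m."""
    return ((other[0] - reference[0]) % m, (other[1] - reference[1]) % m)


def peaks_for(offset, m=128):
    """Synthetic correlation array: +peak at offset, -peak 48 right and 80 down of it."""
    d = [[0.0] * m for _ in range(m)]
    ox, oy = offset
    d[oy % m][ox % m] = 1.0                      # plays the role of peak 62
    d[(oy + 80) % m][(ox + 48) % m] = -1.0       # plays the role of peak 63
    return d


def decode(d, m=128, threshold=0.5):
    """Shift of the negative peak relative to the positive reference peak."""
    peaks = find_peaks(d, threshold)
    ref = next(pos for sign, pos in peaks if sign > 0)
    neg = next(pos for sign, pos in peaks if sign < 0)
    return relative_shift(ref, neg, m)


print(decode(peaks_for((0, 0))))     # untranslated image
print(decode(peaks_for((10, 20))))   # same content after a translation
```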

The embedded information may identify, for example, the copyright holder or
a description of the content. In DVD copy-protection, it allows material to be labeled as 'copy 
once', 'never copy', 'no restriction', 'copy no more', etc. Fig. 7 shows a DVD drive for 
playing back an MPEG bitstream which is recorded on a disc 71. The recorded signal is 
applied to an output terminal 73 via a switch 72. The output terminal is connected to an 
external MPEG decoder and display device (not shown). It is assumed that the DVD drive may 
not play back video signals with a predetermined embedded watermark, unless other 
conditions are fulfilled which are not relevant to the invention. For example, watermarked 
signals may only be played back if the disc 71 includes a given "wobble" key. In order to 
detect the watermark, the DVD drive comprises a watermark detector 74 as described above. 
The detector receives the recorded signal and controls the switch 72 in response to whether or 
not the watermark is detected. 

The evaluation circuit 29 (Fig. 4) records one or more triples S = {(i, s_{i,j}, k_{i,j})}
for each primitive watermark pattern W_i applied to the watermark detector. Herein, i
represents the index of the primitive pattern, s_{i,j} its sign, and k_{i,j} its position with respect to the
applied pattern. From these data the embedded key K is derived.

A multi-bit code can be embedded in a single shifted watermark pattern (e.g. the
pattern W2' shown in Fig. 3), provided that the corresponding basic watermark pattern (W2)
applied to the detector has the same position with respect to the image as in the embedder. In
that case, the coordinates of the peak in the correlation matrix (i.e. peak 63 in Fig. 6B)
unambiguously represent the vector k. In practice, however, the absolute position of a peak in
the array of correlation values corresponding with a given basic watermark may vary, due to
cropping or translation of images. The relative positions of multiple peaks, however, are
translation and cropping invariant. In view hereof, it is advantageous to embed multiple
watermarks and encode the key K into their relative positions. Preferably, one of the peaks
provides a reference position. This can be achieved by embedding a predetermined unshifted
watermark (cf. W1 which provides reference peak 61 in Fig. 6A) or embedding one of the
multiple watermarks with a different sign (cf. W2 which provides reference peak 62 in
Fig. 6B).

A mathematical analysis of the number of bits that can be embedded will now
be given. More generally, we will assume that we have n basic watermark tiles W1..Wn, all of
the same fixed size MxM, and mutually uncorrelated. M is of the form M=2^m for an integer m.
Typically, we have M=128=2^7. Practically feasible numbers of different basic patterns to be
applied are presently small: we may for instance think of n=4 or n=8. The exact location of a
peak is only accurate up to a few pixels. Therefore, to embed information in relative shifts of
peaks, we use a coarser grid for allowed translations of basic watermark patterns. We will
consider grids of size GxG, where G=2^g for an integer g smaller than m. The grid spacing is
h=M/G.
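
The use of a coarser grid can be sketched as a pair of mapping functions. The text does not specify the encoding function E, so the bit layout below (low g bits for the horizontal grid index, high g bits for the vertical one) and the function names are purely hypothetical. Rounding to the nearest grid point absorbs peak position errors of less than half the grid spacing.

```python
def bits_to_shift(value, g, m):
    """Map a 2g-bit integer to a shift (kx, ky) on the GxG grid, spacing h = m / G."""
    big_g = 1 << g
    h = m // big_g
    gx, gy = value % big_g, value // big_g
    return (gx * h, gy * h)


def shift_to_bits(k, g, m):
    """Round a detected shift to the nearest grid point and return the 2g-bit integer."""
    big_g = 1 << g
    h = m // big_g
    gx = ((k[0] + h // 2) // h) % big_g
    gy = ((k[1] + h // 2) // h) % big_g
    return gy * big_g + gx


# M = 128 and g = 5 give a 32x32 grid with spacing h = 4, i.e. 10 bits per shift.
value = 0b1010001100
shift = bits_to_shift(value, g=5, m=128)
print(shift)                               # grid-aligned shift vector
print(shift_to_bits((49, 79), g=5, m=128)) # peak detected one pixel off in x and y
```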

We will first consider the number of bits that can be embedded in n different
basic watermark patterns (W1..Wn), the peak of one of which (say W1) is used to provide a
reference position. In this case, we embed the information in the relative positions of W2..Wn
with respect to W1. For each of these patterns W2..Wn, we have G^2 possible shifts (i.e. 2g bits).
The information content which can be embedded in the relative shifts of n watermark patterns
on a GxG grid equals 2g(n-1) bits. The following Table I shows these numbers of bits for
various grid sizes and numbers of basic patterns. In this table, we assume that the watermark
patterns are of size 128 x 128.

h     GxG      n=2   n=3   n=4   n=5   n=6
16    8x8        6    12    18    24    30
 8    16x16      8    16    24    32    40
 4    32x32     10    20    30    40    50

Table I: The number of bits that can be embedded using the shifts of n watermarks on grids
of spacing 16, 8 and 4.
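
The entries of Table I follow directly from the expression 2g(n-1). A short sketch, with a hypothetical function name, that prints the rows for M=128:

```python
def bits_with_reference(n, g):
    """Bits carried by n different patterns on a GxG grid (G = 2**g), one being the reference."""
    return 2 * g * (n - 1)


M = 128
for g in (3, 4, 5):
    G = 1 << g
    row = [bits_with_reference(n, g) for n in range(2, 7)]
    print(M // G, f"{G}x{G}", row)
```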

A grid spacing h of 4 pixels seems to be a feasible choice given the current 
precision of peak detection. When scalings have to be taken into account, perhaps larger 
spacings are required. The number of watermarks that can be applied may be as high as 4 or 
even 6 when it comes to visibility. Robustness need not always be a big issue with, say, 4 basic
patterns, but detection complexity still is. It is therefore of interest to investigate the situation 
where we use different shifts of just one basic pattern. 

We will also consider the number of bits that can be embedded in n translated
versions of only one basic pattern. This has the advantage that we only need to apply one
pattern to the detector to determine n correlation peaks. It reduces the complexity of detection
by a factor n, when compared to the situation where n different patterns are being used. We
will see that this is at the expense of some information content, but that reduction factor is
considerably less than that in detection time. There are two important differences when we
compare using n shifts of the same watermark with using n different watermarks:

All shifts must be different. This is not required when different patterns are
used.

There is no reference position, as opposed to the situation described above
where we 'fixed' W1, and considered relative positions of other watermarks
(W2,W2') with respect to the position of W1.

Fig. 8 shows examples of peak patterns on an 8x8 grid (h=16) in the case that a
basic watermark pattern W1 has been embedded 3 times, with different shifts. The peak pattern
81 shows the positions of the 3 peaks as detected by the watermark detector. Note that cyclic
shifts of this peak pattern may result from the same watermark. For example, the peak patterns
82, 83 and 84 (in which one of the peaks is shifted to the lower-left corner) are all equivalent
to the peak pattern 81. Fig. 9 shows a similar peak pattern for 4 shifted versions of a single
basic watermark pattern W1. In this case, all shifted versions of the peak pattern with one peak
in the lower-left corner are identical.
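
One way for a detector to treat cyclically shifted peak patterns as the same message is to reduce every detected pattern to a canonical representative. The sketch below is an assumption about how this could be done, not a procedure given in the text: each peak in turn is moved to the origin (the lower-left corner in the figures) and the smallest resulting pattern is kept. The peak coordinates are made up for illustration.

```python
def canonical(peaks, g_size):
    """Return a representative that is identical for all cyclic shifts of a peak pattern."""
    candidates = []
    for ox, oy in peaks:
        # Translate the whole pattern so that this peak lands on the origin.
        moved = sorted(((x - ox) % g_size, (y - oy) % g_size) for x, y in peaks)
        candidates.append(tuple(moved))
    return min(candidates)


G = 8
pattern = [(1, 2), (4, 6), (7, 3)]                        # hypothetical 3-peak pattern
shifted = [((x + 3) % G, (y + 5) % G) for x, y in pattern]  # same pattern, translated
print(canonical(pattern, G) == canonical(shifted, G))
```

Two patterns then carry the same data exactly when their canonical representatives are equal.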

To determine the exact information content, we need to count all possible
different patterns up to cyclic shifts. The inventors have carried out these calculations. The
result is listed in the following Table II.



h     GxG      n=2   n=3   n=4   n=5   n=6
16    8x8        5     9    13    16    20
 8    16x16      7    13    19    25    30
 4    32x32      9    17    25    33    40

Table II: The number of bits that can be embedded by using n shifted versions of one
watermark pattern on grids of spacing 16, 8 and 4.
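
The text does not say how the patterns were counted. One possible way, sketched here as an assumption, is Burnside's lemma applied to the group of cyclic translations of the GxG grid: the number of distinct patterns equals the average, over all translations, of the number of n-peak patterns left unchanged by that translation. Taking the whole number of bits of that count gives the table entries. The function names are hypothetical.

```python
from math import comb, gcd


def count_patterns(n, g_size):
    """Count patterns of n distinct peaks on a GxG grid, up to cyclic shifts."""
    total = 0
    for tx in range(g_size):
        for ty in range(g_size):
            # Order r of the translation (tx, ty): smallest r with r*(tx, ty) = 0 mod G.
            ox = g_size // gcd(tx, g_size)
            oy = g_size // gcd(ty, g_size)
            r = ox * oy // gcd(ox, oy)
            # A pattern fixed by this translation is a union of cycles of length r.
            if n % r == 0:
                total += comb(g_size * g_size // r, n // r)
    return total // (g_size * g_size)


def bits_single_pattern(n, g_size):
    """Largest whole number of bits that fits in the number of distinct patterns."""
    return count_patterns(n, g_size).bit_length() - 1


for g_size in (8, 16, 32):
    print(128 // g_size, f"{g_size}x{g_size}",
          [bits_single_pattern(n, g_size) for n in range(2, 7)])
```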

The methods described above can be combined in several ways. For instance,
one can use multiple shifted versions of different patterns, or one can use sign information in
combination with shifts, etc.

Thus, the invention is based on the invariance properties of a watermark method
that is based on embedding n basic watermark patterns. The detection method in the Fourier
domain enables the watermark to be found in shifted or cropped versions of an image. The
exact shift of a watermark pattern is represented by a correlation peak, obtained after inverting
the Fast Fourier Transform. The invention exploits the insight that, since the exact shift of the
watermark is detected, this shift can be used to embed information. The invention allows
watermark detection to be used, in a cost-effective manner, for embedding multi-bit
information rather than merely deciding whether an image or video is watermarked or not.

In summary, a method is disclosed for embedding auxiliary data in a signal. The
data is encoded into the relative position or phase of one or more basic watermark patterns.
This allows multi-bit data to be embedded by using only one or a few distinct watermark
patterns.



